Dataset statistics
| Number of variables | 13 |
|---|---|
| Number of observations | 140112 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 13.9 MiB |
| Average record size in memory | 104.0 B |
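As a quick consistency check on the table above, the average record size should equal the total memory footprint divided by the number of observations. The sketch below redoes that arithmetic from the reported figures. The 8-bytes-per-column figure is an assumption made here for illustration (64-bit values or object pointers); the report does not state it.

```python
# Figures copied from the "Dataset statistics" table.
TOTAL_MIB = 13.9
N_ROWS = 140112
N_COLS = 13


def average_record_size(total_mib: float, n_rows: int) -> float:
    """Average bytes per row, given a total size in MiB (1 MiB = 1024**2 bytes)."""
    return total_mib * 1024**2 / n_rows


avg_bytes = average_record_size(TOTAL_MIB, N_ROWS)
print(f"{avg_bytes:.1f} B per record")

# Assumption (not from the report): every column costs 8 bytes per row.
expected_bytes = N_COLS * 8
print(f"{expected_bytes} B per record under the 8-byte assumption")
```

Both figures agree with the reported 104.0 B to the precision shown.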
Variable types
| Numeric | 10 |
|---|---|
| Categorical | 3 |
Alerts
| Alert | Type |
|---|---|
| CMD has a high cardinality: 342 distinct values | High cardinality |
| ts is highly correlated with label and 1 other fields | High correlation |
| label is highly correlated with ts and 1 other fields | High correlation |
| type is highly correlated with ts and 2 other fields | High correlation |
| MINFLT is highly correlated with MAJFLT | High correlation |
| MAJFLT is highly correlated with MINFLT | High correlation |
| PID is highly correlated with type | High correlation |
| VSIZE is highly correlated with VGROW | High correlation |
| VGROW is highly correlated with VSIZE and 1 other fields | High correlation |
| RGROW is highly correlated with VGROW | High correlation |
| MINFLT is highly skewed (γ1 = 339.0069078) | Skewed |
| MAJFLT is highly skewed (γ1 = 135.0365829) | Skewed |
| RGROW is highly skewed (γ1 = 25.81573545) | Skewed |
| MINFLT has 67018 (47.8%) zeros | Zeros |
| MAJFLT has 136555 (97.5%) zeros | Zeros |
| VSTEXT has 7780 (5.6%) zeros | Zeros |
| VSIZE has 7830 (5.6%) zeros | Zeros |
| RSIZE has 7834 (5.6%) zeros | Zeros |
| VGROW has 132559 (94.6%) zeros | Zeros |
| RGROW has 128837 (92.0%) zeros | Zeros |
| MEM has 109996 (78.5%) zeros | Zeros |
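The Skewed and Zeros entries can be reproduced in spirit with a few lines of standard-library Python. The sketch below computes the population skewness γ1 = m3 / m2^(3/2) and the share of zeros for a toy, zero-heavy sample. The sample values and function names are hypothetical, and the profiler may use a bias-adjusted estimator, so this is not expected to match the reported γ1 values digit for digit.

```python
def skewness(values):
    """Population skewness g1 = m3 / m2**1.5, with no bias adjustment."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# Hypothetical sample: mostly zeros plus one extreme value, like a fault counter.
toy = [0] * 95 + [4, 8, 12, 20, 5000]
symmetric = [-2, -1, 0, 1, 2]

print(skewness(toy), zero_share(toy))
print(skewness(symmetric))

# Share of zeros implied by the MINFLT counts in the report.
minflt_zero_pct = round(100 * 67018 / 140112, 1)
print(minflt_zero_pct)
```

A single extreme value in an otherwise near-constant column is enough to push γ1 far above zero, which is the pattern the MINFLT, MAJFLT, and RGROW entries suggest.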
Reproduction
| Analysis started | 2022-11-12 02:05:37.704658 |
|---|---|
| Analysis finished | 2022-11-12 02:06:16.465667 |
| Duration | 38.76 seconds |
| Software version | pandas-profiling v3.4.0 |
| Download configuration | config.json |
ts
| Distinct | 131032 |
|---|---|
| Distinct (%) | 93.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1554978473 |
| Minimum | 1554218915 |
| Maximum | 1556549129 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | 1554218915 |
|---|---|
| 5-th percentile | 1554253943 |
| Q1 | 1554394054 |
| median | 1554569192 |
| Q3 | 1556208088 |
| 95-th percentile | 1556306420 |
| Maximum | 1556549129 |
| Range | 2330214 |
| Interquartile range (IQR) | 1814034.25 |
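These values look like Unix epoch timestamps in seconds. That reading is an assumption, since the report does not document the column. Under it, the sketch below converts the reported minimum and maximum to UTC and expresses the range in days.

```python
from datetime import datetime, timezone

# Minimum and maximum copied from the quantile table.
TS_MIN = 1554218915
TS_MAX = 1556549129


def to_utc(epoch_seconds: int) -> datetime:
    """Interpret an integer as seconds since 1970-01-01 UTC (an assumption)."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


start, end = to_utc(TS_MIN), to_utc(TS_MAX)
span_seconds = TS_MAX - TS_MIN
span_days = span_seconds / 86400

print(start.isoformat())
print(end.isoformat())
print(f"{span_days:.2f} days")
```

If the assumption holds, the capture covers a little under four weeks, and the difference between the two extremes equals the reported Range.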
Descriptive statistics
| Standard deviation | 814232.4719 |
|---|---|
| Coefficient of variation (CV) | 0.0005236294174 |
| Kurtosis | -1.122518002 |
| Mean | 1554978473 |
| Median Absolute Deviation (MAD) | 200560 |
| Skewness | 0.8758078916 |
| Sum | 2.178711438 × 10^14 |
| Variance | 6.629745182 × 10^11 |
| Monotonicity | Not monotonic |
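Several entries in this table are derived from others: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below recomputes them from the reported (rounded) figures, so small differences in the last digits are expected.

```python
# Values copied from the tables for this variable and from "Dataset statistics".
MEAN = 1554978473
STD = 814232.4719
MINIMUM, MAXIMUM = 1554218915, 1556549129
N_ROWS = 140112

cv = STD / MEAN                   # coefficient of variation
variance = STD ** 2               # variance is the squared standard deviation
value_range = MAXIMUM - MINIMUM   # range
total = MEAN * N_ROWS             # sum, from the rounded mean

print(f"CV       = {cv:.10g}")
print(f"Variance = {variance:.10g}")
print(f"Range    = {value_range}")
print(f"Sum      = {total:.10g}")
```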
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1556238768 | 2 | < 0.1% |
| 1556214468 | 2 | < 0.1% |
| 1556214503 | 2 | < 0.1% |
| 1556214493 | 2 | < 0.1% |
| 1556214488 | 2 | < 0.1% |
| 1556214483 | 2 | < 0.1% |
| 1556214478 | 2 | < 0.1% |
| 1556214473 | 2 | < 0.1% |
| 1556214463 | 2 | < 0.1% |
| 1556214513 | 2 | < 0.1% |
| Other values (131022) | 140092 |
| Value | Count | Frequency (%) |
| 1554218915 | 1 | |
| 1554218920 | 1 | |
| 1554218925 | 1 | |
| 1554218930 | 1 | |
| 1554218935 | 1 | |
| 1554218940 | 1 | |
| 1554218945 | 1 | |
| 1554218950 | 1 | |
| 1554218955 | 1 | |
| 1554218960 | 1 |
| Value | Count | Frequency (%) |
| 1556549129 | 2 | |
| 1556548639 | 2 | |
| 1556548364 | 2 | |
| 1556547914 | 2 | |
| 1556547464 | 2 | |
| 1556547359 | 2 | |
| 1556547354 | 2 | |
| 1556547344 | 2 | |
| 1556547264 | 2 | |
| 1556547224 | 2 |
PID
| Distinct | 2821 |
|---|---|
| Distinct (%) | 2.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3266.943538 |
| Minimum | 1007 |
| Maximum | 53096 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | 1007 |
|---|---|
| 5-th percentile | 1371 |
| Q1 | 2533 |
| median | 3155 |
| Q3 | 3895 |
| 95-th percentile | 4801 |
| Maximum | 53096 |
| Range | 52089 |
| Interquartile range (IQR) | 1362 |
Descriptive statistics
| Standard deviation | 2119.039113 |
|---|---|
| Coefficient of variation (CV) | 0.6486304671 |
| Kurtosis | 381.9108966 |
| Mean | 3266.943538 |
| Median Absolute Deviation (MAD) | 638 |
| Skewness | 16.49806154 |
| Sum | 457737993 |
| Variance | 4490326.762 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 3790 | 4179 | 3.0% |
| 3793 | 4159 | 3.0% |
| 2774 | 3955 | 2.8% |
| 3676 | 3954 | 2.8% |
| 3675 | 3954 | 2.8% |
| 1442 | 3954 | 2.8% |
| 3677 | 3954 | 2.8% |
| 3678 | 3954 | 2.8% |
| 1371 | 3952 | 2.8% |
| 2797 | 3950 | 2.8% |
| Other values (2811) | 100147 |
| Value | Count | Frequency (%) |
| 1007 | 155 | |
| 1026 | 137 | |
| 1063 | 40 | < 0.1% |
| 1087 | 12 | < 0.1% |
| 1103 | 37 | < 0.1% |
| 1124 | 121 | |
| 1133 | 58 | < 0.1% |
| 1134 | 205 | |
| 1135 | 1 | < 0.1% |
| 1137 | 33 | < 0.1% |
| Value | Count | Frequency (%) |
| 53096 | 1 | |
| 53055 | 1 | |
| 53048 | 1 | |
| 53046 | 1 | |
| 53044 | 1 | |
| 53043 | 2 | |
| 53042 | 1 | |
| 53041 | 1 | |
| 53040 | 1 | |
| 53030 | 1 |
MINFLT
| Distinct | 3560 |
|---|---|
| Distinct (%) | 2.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 773.4218197 |
| Minimum | 0 |
| Maximum | 8050000 |
| Zeros | 67018 |
| Zeros (%) | 47.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 3 |
| Q3 | 1960 |
| 95-th percentile | 2073 |
| Maximum | 8050000 |
| Range | 8050000 |
| Interquartile range (IQR) | 1960 |
Descriptive statistics
| Standard deviation | 22333.44673 |
|---|---|
| Coefficient of variation (CV) | 28.87615291 |
| Kurtosis | 120818.217 |
| Mean | 773.4218197 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 339.0069078 |
| Sum | 108365678 |
| Variance | 498782842.9 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 67018 | |
| 1963 | 9583 | 6.8% |
| 1960 | 8600 | 6.1% |
| 8 | 5634 | 4.0% |
| 7 | 3909 | 2.8% |
| 2072 | 3301 | 2.4% |
| 3 | 3021 | 2.2% |
| 2 | 1411 | 1.0% |
| 2075 | 1372 | 1.0% |
| 9 | 1294 | 0.9% |
| Other values (3550) | 34969 |
| Value | Count | Frequency (%) |
| 0 | 67018 | |
| 1 | 1109 | 0.8% |
| 2 | 1411 | 1.0% |
| 3 | 3021 | 2.2% |
| 4 | 818 | 0.6% |
| 5 | 763 | 0.5% |
| 6 | 722 | 0.5% |
| 7 | 3909 | 2.8% |
| 8 | 5634 | 4.0% |
| 9 | 1294 | 0.9% |
| Value | Count | Frequency (%) |
| 8050000 | 1 | |
| 1900000 | 1 | |
| 859502 | 1 | |
| 356224 | 1 | |
| 296844 | 1 | |
| 247354 | 1 | |
| 156527 | 1 | |
| 156476 | 1 | |
| 156469 | 1 | |
| 156467 | 1 |
MAJFLT
| Distinct | 261 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.203921863 |
| Minimum | 0 |
| Maximum | 107776 |
| Zeros | 136555 |
| Zeros (%) | 97.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 107776 |
| Range | 107776 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 462.0865911 |
|---|---|
| Coefficient of variation (CV) | 74.4829805 |
| Kurtosis | 24764.68521 |
| Mean | 6.203921863 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 135.0365829 |
| Sum | 869243.9 |
| Variance | 213524.0177 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 136555 | |
| 1 | 1417 | 1.0% |
| 2 | 429 | 0.3% |
| 3 | 243 | 0.2% |
| 4 | 196 | 0.1% |
| 5 | 145 | 0.1% |
| 6 | 75 | 0.1% |
| 8 | 68 | < 0.1% |
| 7 | 57 | < 0.1% |
| 10 | 51 | < 0.1% |
| Other values (251) | 876 | 0.6% |
| Value | Count | Frequency (%) |
| 0 | 136555 | |
| 1 | 1417 | 1.0% |
| 1.1 | 2 | < 0.1% |
| 2 | 429 | 0.3% |
| 2.9 | 1 | < 0.1% |
| 3 | 243 | 0.2% |
| 4 | 196 | 0.1% |
| 5 | 145 | 0.1% |
| 6 | 75 | 0.1% |
| 7 | 57 | < 0.1% |
| Value | Count | Frequency (%) |
| 107776 | 1 | |
| 50776 | 1 | |
| 45113 | 1 | |
| 44392 | 1 | |
| 36161 | 1 | |
| 35001 | 1 | |
| 31968 | 1 | |
| 29880 | 2 | |
| 28186 | 1 | |
| 24472 | 1 |
VSTEXT
| Distinct | 147 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 578.7817118 |
| Minimum | 0 |
| Maximum | 50652 |
| Zeros | 7780 |
| Zeros (%) | 5.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 47 |
| median | 148 |
| Q3 | 589 |
| 95-th percentile | 2219 |
| Maximum | 50652 |
| Range | 50652 |
| Interquartile range (IQR) | 542 |
Descriptive statistics
| Standard deviation | 1373.721981 |
|---|---|
| Coefficient of variation (CV) | 2.373471644 |
| Kurtosis | 52.49758787 |
| Mean | 578.7817118 |
| Median Absolute Deviation (MAD) | 112 |
| Skewness | 6.149402649 |
| Sum | 81094263.2 |
| Variance | 1887112.081 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 148 | 38364 | |
| 36 | 13268 | 9.5% |
| 0 | 7780 | 5.6% |
| 10 | 5303 | 3.8% |
| 2132 | 5254 | 3.7% |
| 1350 | 5233 | 3.7% |
| 2219 | 5197 | 3.7% |
| 589 | 5149 | 3.7% |
| 596 | 4980 | 3.6% |
| 66 | 4944 | 3.5% |
| Other values (137) | 44640 |
| Value | Count | Frequency (%) |
| 0 | 7780 | |
| 3 | 58 | < 0.1% |
| 4 | 1142 | 0.8% |
| 6 | 121 | 0.1% |
| 7 | 125 | 0.1% |
| 9 | 121 | 0.1% |
| 10 | 5303 | |
| 11 | 52 | < 0.1% |
| 12 | 1932 | 1.4% |
| 13 | 36 | < 0.1% |
| Value | Count | Frequency (%) |
| 50652 | 1 | |
| 28212 | 1 | |
| 22668 | 1 | |
| 21636 | 1 | |
| 21556 | 2 | |
| 19440 | 1 | |
| 17264 | 2 | |
| 16640 | 1 | |
| 14916 | 1 | |
| 13628 | 1 |
VSIZE
| Distinct | 1128 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8547.21741 |
| Minimum | 0 |
| Maximum | 78328 |
| Zeros | 7830 |
| Zeros (%) | 5.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 114.7 |
| median | 710.1 |
| Q3 | 17948 |
| 95-th percentile | 22440 |
| Maximum | 78328 |
| Range | 78328 |
| Interquartile range (IQR) | 17833.3 |
Descriptive statistics
| Standard deviation | 11445.85971 |
|---|---|
| Coefficient of variation (CV) | 1.339132862 |
| Kurtosis | 4.195778917 |
| Mean | 8547.21741 |
| Median Absolute Deviation (MAD) | 708.7 |
| Skewness | 1.611845947 |
| Sum | 1197567726 |
| Variance | 131007704.6 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 17948 | 12639 | 9.0% |
| 1.4 | 8926 | 6.4% |
| 0 | 7830 | 5.6% |
| 19296 | 5239 | 3.7% |
| 17644 | 4503 | 3.2% |
| 2.9 | 4372 | 3.1% |
| 114.7 | 3949 | 2.8% |
| 2.4 | 3944 | 2.8% |
| 17640 | 3797 | 2.7% |
| 710.1 | 3725 | 2.7% |
| Other values (1118) | 81188 |
| Value | Count | Frequency (%) |
| 0 | 7830 | |
| 1 | 99 | 0.1% |
| 1.1 | 84 | 0.1% |
| 1.2 | 115 | 0.1% |
| 1.3 | 1468 | 1.0% |
| 1.4 | 8926 | |
| 1.5 | 82 | 0.1% |
| 1.6 | 18 | < 0.1% |
| 1.7 | 268 | 0.2% |
| 1.8 | 414 | 0.3% |
| Value | Count | Frequency (%) |
| 78328 | 117 | |
| 76860 | 31 | < 0.1% |
| 75364 | 91 | |
| 71240 | 102 | |
| 70492 | 130 | |
| 70300 | 99 | |
| 68692 | 1 | < 0.1% |
| 64028 | 1 | < 0.1% |
| 63164 | 28 | < 0.1% |
| 61728 | 4 | < 0.1% |
RSIZE
| Distinct | 4197 |
|---|---|
| Distinct (%) | 3.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10193.1164 |
| Minimum | 0 |
| Maximum | 99496 |
| Zeros | 7834 |
| Zeros (%) | 5.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1588 |
| median | 2604 |
| Q3 | 11064 |
| 95-th percentile | 51760 |
| Maximum | 99496 |
| Range | 99496 |
| Interquartile range (IQR) | 9476 |
Descriptive statistics
| Standard deviation | 17039.08105 |
|---|---|
| Coefficient of variation (CV) | 1.67162626 |
| Kurtosis | 7.631734703 |
| Mean | 10193.1164 |
| Median Absolute Deviation (MAD) | 2180 |
| Skewness | 2.711704469 |
| Sum | 1428177925 |
| Variance | 290330282.9 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 7834 | 5.6% |
| 2612 | 6681 | 4.8% |
| 2604 | 3904 | 2.8% |
| 2608 | 3896 | 2.8% |
| 98.2 | 3171 | 2.3% |
| 568 | 2834 | 2.0% |
| 13488 | 2825 | 2.0% |
| 59188 | 2798 | 2.0% |
| 8204 | 2796 | 2.0% |
| 2484 | 2462 | 1.8% |
| Other values (4187) | 100911 |
| Value | Count | Frequency (%) |
| 0 | 7834 | |
| 4 | 1 | < 0.1% |
| 12 | 2 | < 0.1% |
| 16 | 2 | < 0.1% |
| 20 | 1 | < 0.1% |
| 32 | 3 | < 0.1% |
| 48 | 1 | < 0.1% |
| 52 | 1 | < 0.1% |
| 60 | 1 | < 0.1% |
| 96 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 99496 | 4 | |
| 99280 | 1 | < 0.1% |
| 99276 | 1 | < 0.1% |
| 99264 | 1 | < 0.1% |
| 99176 | 1 | < 0.1% |
| 99044 | 1 | < 0.1% |
| 99040 | 1 | < 0.1% |
| 99024 | 2 | |
| 98936 | 2 | |
| 98920 | 4 |
VGROW
| Distinct | 556 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 401.7808125 |
| Minimum | 0 |
| Maximum | 99268 |
| Zeros | 132559 |
| Zeros (%) | 94.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 4 |
| Maximum | 99268 |
| Range | 99268 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 4378.613411 |
|---|---|
| Coefficient of variation (CV) | 10.89801522 |
| Kurtosis | 178.2719434 |
| Mean | 401.7808125 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 12.83247441 |
| Sum | 56294313.2 |
| Variance | 19172255.4 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 132559 | |
| 4 | 1397 | 1.0% |
| 132 | 637 | 0.5% |
| 256 | 628 | 0.4% |
| 128 | 590 | 0.4% |
| 40 | 474 | 0.3% |
| 29840 | 353 | 0.3% |
| 51980 | 240 | 0.2% |
| 8 | 188 | 0.1% |
| 1024 | 168 | 0.1% |
| Other values (546) | 2878 | 2.1% |
| Value | Count | Frequency (%) |
| 0 | 132559 | |
| 0.2 | 14 | < 0.1% |
| 0.4 | 2 | < 0.1% |
| 1 | 2 | < 0.1% |
| 1.3 | 2 | < 0.1% |
| 1.4 | 2 | < 0.1% |
| 1.5 | 2 | < 0.1% |
| 1.7 | 1 | < 0.1% |
| 2 | 1 | < 0.1% |
| 2.3 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 99268 | 2 | |
| 90188 | 1 | < 0.1% |
| 88672 | 1 | < 0.1% |
| 83668 | 1 | < 0.1% |
| 81740 | 2 | |
| 78568 | 1 | < 0.1% |
| 78328 | 4 | |
| 77484 | 1 | < 0.1% |
| 76860 | 2 | |
| 76552 | 1 | < 0.1% |
RGROW
| Distinct | 1782 |
|---|---|
| Distinct (%) | 1.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 142.9474956 |
| Minimum | 0 |
| Maximum | 98920 |
| Zeros | 128837 |
| Zeros (%) | 92.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 68 |
| Maximum | 98920 |
| Range | 98920 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1543.19942 |
|---|---|
| Coefficient of variation (CV) | 10.79556808 |
| Kurtosis | 1070.48339 |
| Mean | 142.9474956 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 25.81573545 |
| Sum | 20028659.5 |
| Variance | 2381464.451 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 128837 | |
| 4 | 1053 | 0.8% |
| 8 | 599 | 0.4% |
| 12 | 383 | 0.3% |
| 20 | 326 | 0.2% |
| 952 | 306 | 0.2% |
| 16 | 296 | 0.2% |
| 24 | 216 | 0.2% |
| 260 | 178 | 0.1% |
| 28 | 151 | 0.1% |
| Other values (1772) | 7767 | 5.5% |
| Value | Count | Frequency (%) |
| 0 | 128837 | |
| 0.1 | 2 | < 0.1% |
| 0.4 | 1 | < 0.1% |
| 4 | 1053 | 0.8% |
| 8 | 599 | 0.4% |
| 9.8 | 2 | < 0.1% |
| 10 | 4 | < 0.1% |
| 10.1 | 6 | < 0.1% |
| 10.2 | 9 | < 0.1% |
| 10.3 | 8 | < 0.1% |
| Value | Count | Frequency (%) |
| 98920 | 1 | |
| 98716 | 1 | |
| 97180 | 1 | |
| 94400 | 1 | |
| 92864 | 1 | |
| 88680 | 1 | |
| 87548 | 1 | |
| 77052 | 1 | |
| 68140 | 1 | |
| 61784 | 1 |
MEM
| Distinct | 15 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.004468782117 |
| Minimum | 0 |
| Maximum | 0.16 |
| Zeros | 109996 |
| Zeros (%) | 78.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0.03 |
| Maximum | 0.16 |
| Range | 0.16 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.01158754176 |
|---|---|
| Coefficient of variation (CV) | 2.592997701 |
| Kurtosis | 23.00815332 |
| Mean | 0.004468782117 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 4.124365351 |
| Sum | 626.13 |
| Variance | 0.000134271124 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=15)
| Value | Count | Frequency (%) |
| 0 | 109996 | |
| 0.01 | 15796 | 11.3% |
| 0.02 | 5999 | 4.3% |
| 0.03 | 5389 | 3.8% |
| 0.06 | 864 | 0.6% |
| 0.05 | 646 | 0.5% |
| 0.07 | 597 | 0.4% |
| 0.04 | 302 | 0.2% |
| 0.08 | 222 | 0.2% |
| 0.09 | 141 | 0.1% |
| Other values (5) | 160 | 0.1% |
| Value | Count | Frequency (%) |
| 0 | 109996 | |
| 0.01 | 15796 | 11.3% |
| 0.02 | 5999 | 4.3% |
| 0.03 | 5389 | 3.8% |
| 0.04 | 302 | 0.2% |
| 0.05 | 646 | 0.5% |
| 0.06 | 864 | 0.6% |
| 0.07 | 597 | 0.4% |
| 0.08 | 222 | 0.2% |
| 0.09 | 141 | 0.1% |
| Value | Count | Frequency (%) |
| 0.16 | 1 | < 0.1% |
| 0.15 | 3 | < 0.1% |
| 0.14 | 36 | < 0.1% |
| 0.11 | 41 | < 0.1% |
| 0.1 | 79 | 0.1% |
| 0.09 | 141 | 0.1% |
| 0.08 | 222 | 0.2% |
| 0.07 | 597 | |
| 0.06 | 864 | |
| 0.05 | 646 |
CMD
| Distinct | 342 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
Length
| Max length | 19 |
|---|---|
| Median length | 13 |
| Mean length | 7.78562864 |
| Min length | 2 |
Characters and Unicode
| Total characters | 1090860 |
|---|---|
| Distinct characters | 52 |
| Distinct categories | 10 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 69 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Xorg |
|---|---|
| 2nd row | haveged |
| 3rd row | kworker/0:1 |
| 4th row | <kworker/u256> |
| 5th row | compiz |
Common Values
| Value | Count | Frequency (%) |
| atop | 37995 | |
| vmtoolsd | 8028 | 5.7% |
| compiz | 5279 | 3.8% |
| ostinato | 5254 | 3.7% |
| irqbalance | 5241 | 3.7% |
| nautilus | 5234 | 3.7% |
| Xorg | 5197 | 3.7% |
| hud-service | 5149 | 3.7% |
| apache2 | 5004 | 3.6% |
| drone | 4943 | 3.5% |
| Other values (332) | 52788 |
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| atop | 38004 | |
| vmtoolsd | 8028 | 5.7% |
| compiz | 5289 | 3.8% |
| ostinato | 5254 | 3.7% |
| irqbalance | 5241 | 3.7% |
| nautilus | 5241 | 3.7% |
| xorg | 5197 | 3.7% |
| hud-service | 5149 | 3.7% |
| apache2 | 5116 | 3.7% |
| drone | 4943 | 3.5% |
| Other values (286) | 52668 |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 113488 | 10.4% |
| t | 102172 | 9.4% |
| a | 101358 | 9.3% |
| e | 77887 | 7.1% |
| p | 75237 | 6.9% |
| n | 58440 | 5.4% |
| s | 57318 | 5.3% |
| i | 52241 | 4.8% |
| d | 49978 | 4.6% |
| u | 45992 | 4.2% |
| Other values (42) | 356749 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 998649 | |
| Dash Punctuation | 45210 | 4.1% |
| Decimal Number | 22559 | 2.1% |
| Other Punctuation | 9267 | 0.8% |
| Uppercase Letter | 9051 | 0.8% |
| Math Symbol | 6029 | 0.6% |
| Connector Punctuation | 75 | < 0.1% |
| Space Separator | 18 | < 0.1% |
| Open Punctuation | 1 | < 0.1% |
| Close Punctuation | 1 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 113488 | |
| t | 102172 | 10.2% |
| a | 101358 | 10.1% |
| e | 77887 | 7.8% |
| p | 75237 | 7.5% |
| n | 58440 | 5.9% |
| s | 57318 | 5.7% |
| i | 52241 | 5.2% |
| d | 49978 | 5.0% |
| u | 45992 | 4.6% |
| Other values (16) | 264538 |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 10981 | |
| 5 | 3819 | 16.9% |
| 6 | 3772 | 16.7% |
| 1 | 1669 | 7.4% |
| 0 | 1661 | 7.4% |
| 3 | 650 | 2.9% |
| 4 | 3 | < 0.1% |
| 8 | 2 | < 0.1% |
| 7 | 2 | < 0.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
| X | 5197 | |
| C | 1278 | 14.1% |
| W | 1258 | 13.9% |
| M | 699 | 7.7% |
| N | 617 | 6.8% |
| T | 2 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 4662 | |
| : | 4584 | |
| . | 21 | 0.2% |
Math Symbol
| Value | Count | Frequency (%) |
| < | 3017 | |
| > | 3010 | |
| ~ | 2 | < 0.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 45210 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 75 |
Space Separator
| Value | Count | Frequency (%) |
| 18 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 1 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1007700 | |
| Common | 83160 | 7.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 113488 | |
| t | 102172 | 10.1% |
| a | 101358 | 10.1% |
| e | 77887 | 7.7% |
| p | 75237 | 7.5% |
| n | 58440 | 5.8% |
| s | 57318 | 5.7% |
| i | 52241 | 5.2% |
| d | 49978 | 5.0% |
| u | 45992 | 4.6% |
| Other values (22) | 273589 |
Common
| Value | Count | Frequency (%) |
| - | 45210 | |
| 2 | 10981 | 13.2% |
| / | 4662 | 5.6% |
| : | 4584 | 5.5% |
| 5 | 3819 | 4.6% |
| 6 | 3772 | 4.5% |
| < | 3017 | 3.6% |
| > | 3010 | 3.6% |
| 1 | 1669 | 2.0% |
| 0 | 1661 | 2.0% |
| Other values (10) | 775 | 0.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1090860 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| o | 113488 | 10.4% |
| t | 102172 | 9.4% |
| a | 101358 | 9.3% |
| e | 77887 | 7.1% |
| p | 75237 | 6.9% |
| n | 58440 | 5.4% |
| s | 57318 | 5.3% |
| i | 52241 | 4.8% |
| d | 49978 | 4.6% |
| u | 45992 | 4.2% |
| Other values (42) | 356749 |
label
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 140112 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 100000 | |
| 1 | 40112 |
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
| 0 | 100000 | |
| 1 | 40112 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 100000 | |
| 1 | 40112 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 140112 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 100000 | |
| 1 | 40112 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 140112 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 100000 | |
| 1 | 40112 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 140112 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 100000 | |
| 1 | 40112 |
type
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.1 MiB |
Length
| Max length | 9 |
|---|---|
| Median length | 6 |
| Mean length | 5.998401279 |
| Min length | 3 |
Characters and Unicode
| Total characters | 840448 |
|---|---|
| Distinct characters | 15 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | dos |
|---|---|
| 2nd row | dos |
| 3rd row | dos |
| 4th row | dos |
| 5th row | dos |
Common Values
| Value | Count | Frequency (%) |
| normal | 100000 | |
| dos | 10000 | 7.1% |
| ddos | 10000 | 7.1% |
| injection | 10000 | 7.1% |
| password | 10000 | 7.1% |
| mitm | 112 | 0.1% |
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
| normal | 100000 | |
| dos | 10000 | 7.1% |
| ddos | 10000 | 7.1% |
| injection | 10000 | 7.1% |
| password | 10000 | 7.1% |
| mitm | 112 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 140000 | |
| n | 120000 | |
| r | 110000 | |
| a | 110000 | |
| m | 100224 | |
| l | 100000 | |
| d | 40000 | 4.8% |
| s | 40000 | 4.8% |
| i | 20112 | 2.4% |
| t | 10112 | 1.2% |
| Other values (5) | 50000 | 5.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 840448 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 140000 | |
| n | 120000 | |
| r | 110000 | |
| a | 110000 | |
| m | 100224 | |
| l | 100000 | |
| d | 40000 | 4.8% |
| s | 40000 | 4.8% |
| i | 20112 | 2.4% |
| t | 10112 | 1.2% |
| Other values (5) | 50000 | 5.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 840448 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 140000 | |
| n | 120000 | |
| r | 110000 | |
| a | 110000 | |
| m | 100224 | |
| l | 100000 | |
| d | 40000 | 4.8% |
| s | 40000 | 4.8% |
| i | 20112 | 2.4% |
| t | 10112 | 1.2% |
| Other values (5) | 50000 | 5.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 840448 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| o | 140000 | |
| n | 120000 | |
| r | 110000 | |
| a | 110000 | |
| m | 100224 | |
| l | 100000 | |
| d | 40000 | 4.8% |
| s | 40000 | 4.8% |
| i | 20112 | 2.4% |
| t | 10112 | 1.2% |
| Other values (5) | 50000 | 5.9% |
Auto
The auto setting is an easily interpretable pairwise column metric that maps each pair of variable types to a method: categorical-categorical uses Cramér's V, numerical-categorical uses Cramér's V (on a discretized numerical column), and numerical-numerical uses Spearman's ρ. This configuration picks the most suitable method for each pair of columns.
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
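The calculation described above can be sketched in plain Python: rank both variables (averaging the ranks of ties), then divide the covariance of the ranks by the product of their standard deviations. The function names and sample data are illustrative only, and this is not the profiler's own implementation.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def correlation(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: the correlation of the rank variables."""
    return correlation(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear in x
print(spearman_rho(x, y), correlation(x, y))
```

On this toy data the rank-based coefficient reaches its maximum while the coefficient on the raw values stays below it, which illustrates the point about nonlinear monotonic relationships.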
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
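A minimal sketch of that calculation, using made-up data, also makes the invariance property concrete: shifting and rescaling either variable by a positive factor leaves r unchanged. The function name and sample values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Hypothetical sample data.
x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [2.0, 1.0, 5.0, 6.0, 12.0]

r_original = pearson_r(x, y)
# Change location and scale of each variable separately.
r_rescaled = pearson_r([3 * a + 10 for a in x], [0.5 * b - 4 for b in y])
print(r_original, r_rescaled)
```

Note that a negative scale factor on one variable would flip the sign of r, so the invariance holds for positive rescaling.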
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
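The pair-counting definition above translates directly into a short, quadratic-time sketch. This is the simple tau-a form, in which tied pairs count as neither concordant nor discordant; library implementations often use a tie-adjusted variant, so results can differ when the data contain ties. The function name and data are illustrative.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair oppositely
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


# One swapped pair out of six: 5 concordant, 1 discordant.
tau = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

Because every pair of observations is visited, this form is only practical for small samples; it is meant to show the definition rather than to scale to the 140112 rows profiled here.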
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available in the Phik project documentation.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.
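Returning to the correlation notes, the bias-corrected Cramér's V mentioned earlier can be sketched from a contingency table as follows. The correction terms are written from the commonly cited form of Bergsma's estimator and should be checked against the original paper before being relied on. The example table is an assumption: it supposes that every normal row has label 0 and every attack row has label 1, which is consistent with the reported counts (100000 and 40112) and the sample rows but is not stated explicitly in the report.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows of counts)."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Correction terms (assumed form of the 2013 bias correction).
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Assumed cross-tabulation of label (rows: 0, 1) by type
# (columns: normal, dos, ddos, injection, password, mitm).
label_by_type = [
    [100000, 0, 0, 0, 0, 0],
    [0, 10000, 10000, 10000, 10000, 112],
]
v_label_type = cramers_v_corrected(label_by_type)
v_independent = cramers_v_corrected([[10, 20], [20, 40]])
print(v_label_type, v_independent)
```

Under the stated assumption, type fully determines label, so the coefficient comes out very close to 1, in line with the high-correlation entry the report raises for these two columns.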
First rows
| | ts | PID | MINFLT | MAJFLT | VSTEXT | VSIZE | RSIZE | VGROW | RGROW | MEM | CMD | label | type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1556129658 | 1494 | 0 | 0.0 | 2219.0 | 390.0 | 82020.0 | 0.0 | 0.0 | 0.02 | Xorg | 1 | dos |
| 1 | 1556129738 | 1641 | 0 | 0.0 | 12.0 | 9480.0 | 3496.0 | 0.0 | 0.0 | 0.00 | haveged | 1 | dos |
| 2 | 1556129778 | 6604 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.00 | kworker/0:1 | 1 | dos |
| 3 | 1556129788 | 51017 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.00 | <kworker/u256> | 1 | dos |
| 4 | 1556129798 | 2766 | 0 | 0.0 | 10.0 | 1.3 | 68724.0 | 0.0 | 0.0 | 0.02 | compiz | 1 | dos |
| 5 | 1556129823 | 3144 | 0 | 0.0 | 2132.0 | 2.5 | 25800.0 | 0.0 | 0.0 | 0.01 | ostinato | 1 | dos |
| 6 | 1556129898 | 1424 | 10 | 0.0 | 36.0 | 19296.0 | 692.0 | 0.0 | 0.0 | 0.00 | irqbalance | 1 | dos |
| 7 | 1556129913 | 52779 | 5 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.00 | <systemd-host> | 1 | dos |
| 8 | 1556129923 | 2766 | 0 | 0.0 | 10.0 | 1.3 | 68724.0 | 0.0 | 0.0 | 0.02 | compiz | 1 | dos |
| 9 | 1556129933 | 1473 | 0 | 0.0 | 10415.0 | 607.7 | 34644.0 | 0.0 | 0.0 | 0.01 | mysqld | 1 | dos |
Last rows
| | ts | PID | MINFLT | MAJFLT | VSTEXT | VSIZE | RSIZE | VGROW | RGROW | MEM | CMD | label | type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 140102 | 1554256745 | 3675 | 2076 | 0.0 | 148.0 | 17948.0 | 2604.0 | 0.0 | 0.0 | 0.00 | atop | 0 | normal |
| 140103 | 1554256750 | 3676 | 2076 | 0.0 | 148.0 | 18016.0 | 2580.0 | 0.0 | 0.0 | 0.00 | atop | 0 | normal |
| 140104 | 1554256755 | 4375 | 2077 | 0.0 | 148.0 | 17952.0 | 2464.0 | 0.0 | 0.0 | 0.00 | atop | 0 | normal |
| 140105 | 1554256760 | 4374 | 2077 | 0.0 | 148.0 | 17948.0 | 2456.0 | 0.0 | 0.0 | 0.00 | atop | 0 | normal |
| 140106 | 1554256765 | 4372 | 2076 | 0.0 | 148.0 | 17644.0 | 2412.0 | 0.0 | 0.0 | 0.00 | atop | 0 | normal |
| 140107 | 1554256770 | 4373 | 2076 | 0.0 | 148.0 | 17648.0 | 2408.0 | 0.0 | 0.0 | 0.00 | atop | 0 | normal |
| 140108 | 1554256775 | 1851 | 0 | 0.0 | 36.0 | 159.8 | 2000.0 | 0.0 | 0.0 | 0.00 | vmtoolsd | 0 | normal |
| 140109 | 1554256780 | 1371 | 5 | 0.0 | 36.0 | 19296.0 | 640.0 | 0.0 | 0.0 | 0.00 | irqbalance | 0 | normal |
| 140110 | 1554256785 | 1668 | 0 | 0.0 | 12.0 | 9480.0 | 292.0 | 0.0 | 0.0 | 0.00 | haveged | 0 | normal |
| 140111 | 1554256790 | 1442 | 1902 | 22.0 | 2219.0 | 811.2 | 255.7 | 11764.0 | 15736.0 | 0.07 | Xorg | 0 | normal |